agentic rl

Apr 9, 2026 · 29 min read Intelligence Cartography

Enter the Scaling of RL Environments

The environment is no longer a passive test harness. It is a data engine. 10 dimensions of scaling, from task generation to multi-agent self-play.

Feb 20, 2026 · 45 min read Intelligence Cartography

Re-visiting Mid-training Stage: for & with Agentic RL

Re-examining mid-training as the strategic centerpiece of the LLM pipeline: how it builds the knowledge foundation for agentic RL, and how RL signals now flow backward to improve mid-training itself.

Feb 13, 2026 · 38 min read Intelligence Cartography

Inside the Agentic RL Training Loop

A Step-by-Step Walkthrough using Slime and SWE-Bench as an Example

Feb 8, 2026 · 43 min read Intelligence Cartography

RL Infra for Large-Scale Agentic Training

From GPU Memory Budgets to Framework Architectures
